Abstract

It is shown that the diffusion equation and its adjoint (time reversed) equation 
can be derived with only a few assumptions, using an information-theoretic 
approach based on the principle of minimum Fisher information. 
I. INTRODUCTION

The derivation of the diffusion equation from a fixed end-point variational principle
is well known [1]. The Lagrangian that is normally used leads simultaneously to two
equations for two real functions: the diffusion equation for a function $\psi$, and its adjoint
(time reversed) equation for a function $\psi^*$. This Lagrangian is usually introduced formally,
without physical justification (consider the following quote from Ref. [1]: "The introduction
of the mirror-image field $\psi^*$, in order to set up a Lagrange function from which to obtain
the diffusion equation, is probably too artificial a procedure to expect to obtain much of
physical significance from it"). We wish to show that this Lagrangian results from applying
an information-theoretic approach to the solution of the following interpolation problem.

Consider an experiment in which the probability density $\rho(x,t)$ and the average velocity
field $v(x,t)$ of a cloud of particles of mass $m$ are measured at times $t_0$ and $t_1$ (for simplicity,
we consider only motion in one dimension). Assume that $\rho$ satisfies the continuity equation

$$\frac{\partial \rho}{\partial t} + \frac{\partial}{\partial x}\left(\rho v\right) = 0. \tag{1}$$

Without additional assumptions regarding the dynamics of the system, the problem of
determining the probability density and velocity field at times $t$ (where $t_0 < t < t_1$) cannot be
solved, since there are infinitely many probability densities and velocity fields that will
interpolate between the values measured at times $t_0$ and $t_1$. However, we would still like
to find best estimates of $\rho$ and $v$, perhaps by adding some assumptions about the physical
processes that determine the motion of the cloud of particles, and by using some principle
of inference to select the most likely probability distribution that might describe its
evolution. The main result of this paper is to show that the dynamics of such a system will be
determined uniquely by the diffusion equation and its adjoint equation,
(where $\psi$ and $\psi^*$, defined by eqs. (13) and (14), are real functions of $\rho$ and $\sigma$, and $D/2m$ is
the diffusion constant) provided we make the following two assumptions about the system:
that the velocity field can be derived from a potential function $\sigma(x,t)$, according to

$$v = \frac{1}{m}\frac{\partial \sigma}{\partial x}, \tag{4}$$

and that the probability density $\rho$ that interpolates between times $t_0$ and $t_1$ is the one that
minimizes the Fisher information $I$ associated with $\rho$, which we define by (see Appendix A)

$$I = \frac{1}{m}\int \frac{1}{\rho}\left(\frac{\partial \rho}{\partial x}\right)^2 dx\,dt. \tag{5}$$

The first assumption is equivalent to introducing a particular physical model, in which the
motion of the cloud of particles corresponds to that of a fluid with no vorticity. The second
assumption is an information-theoretical assumption.

II. DERIVATION OF THE DIFFUSION EQUATION FROM A VARIATIONAL PRINCIPLE

Eqs. (1) and (4) lead to the continuity equation

$$\frac{\partial \rho}{\partial t} + \frac{1}{m}\frac{\partial}{\partial x}\left(\rho\,\frac{\partial \sigma}{\partial x}\right) = 0. \tag{6}$$

Eq. (6) can be derived from the Lagrangian $L_{cl}$ by fixed end-point variation with respect
to $\sigma$,

$$L_{cl} = \int \rho \left\{ \frac{\partial \sigma}{\partial t} + \frac{1}{2m}\left(\frac{\partial \sigma}{\partial x}\right)^2 \right\} dx\,dt. \tag{7}$$

Note also that fixed end-point variation with respect to $\rho$ leads trivially to the
Hamilton-Jacobi equation,

$$\frac{\partial \sigma}{\partial t} + \frac{1}{2m}\left(\frac{\partial \sigma}{\partial x}\right)^2 = 0. \tag{8}$$

Therefore, variation of $L_{cl}$ with respect to both $\rho$ and $\sigma$ leads to the equations of motion
for a classical ensemble, eqs. (6) and (8). There is still considerable freedom in the choice
of probability density that can be used to describe the system, since it is only subject to
eq. (6). To derive the diffusion equation and its adjoint, we need to restrict the choice
of probability densities using the principle of minimum Fisher information. We consider
therefore the Lagrangian $L_D$,
The Lagrangian $L_D$ equals $L_{cl}$ plus an additional term proportional to the Fisher
information $I$,

Fixed end-point variation of $L_D$ with respect to $\sigma$ leads once more to eq. (6), while variation
with respect to $\rho$ leads to a modified Hamilton-Jacobi equation that includes a term $Q$ which
is of the form of Bohm's quantum potential (but notice that it appears here within the
context of a classical theory),

with
Eqs. (6) and (11) are identical to eqs. (2) and (3) provided we set

$$\psi = \sqrt{\rho}\; e^{+\sigma/D}, \tag{13}$$

$$\psi^* = \sqrt{\rho}\; e^{-\sigma/D}. \tag{14}$$

It can be shown (see Appendix B) that the Fisher information $I$ increases when $\rho$ is varied
while $\sigma$ is kept fixed. Therefore, the solution derived here is the one that minimizes the
Fisher information for a given $\sigma$.
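
As an illustrative check of the classical ensemble equations of this section, the following Python sketch verifies numerically that one particular pair $(\rho, \sigma)$ satisfies both the Hamilton-Jacobi equation and the continuity equation. It is a sketch under stated assumptions, not part of the derivation: the equations are taken in their standard forms $\partial_t \sigma + (\partial_x \sigma)^2/2m = 0$ and $\partial_t \rho + \partial_x(\rho\,\partial_x \sigma)/m = 0$, and the free-particle solution $\sigma = m x^2/2t$, the density $f(x/t)/t$ with a unit Gaussian $f$, the mass value, and all function names are choices made here for illustration.

```python
import math

M = 2.0  # particle mass (arbitrary illustrative value)


def sigma(x, t):
    """Assumed free-particle potential: sigma = m x^2 / (2 t)."""
    return M * x * x / (2.0 * t)


def rho(x, t):
    """Assumed density f(x/t)/t, with f a unit Gaussian."""
    u = x / t
    return math.exp(-0.5 * u * u) / (math.sqrt(2.0 * math.pi) * t)


def d_dx(f, x, t, h=1e-4):
    """Central finite difference in x."""
    return (f(x + h, t) - f(x - h, t)) / (2.0 * h)


def d_dt(f, x, t, h=1e-4):
    """Central finite difference in t."""
    return (f(x, t + h) - f(x, t - h)) / (2.0 * h)


def hamilton_jacobi_residual(x, t):
    """d(sigma)/dt + (d(sigma)/dx)^2 / (2m); should be close to zero."""
    return d_dt(sigma, x, t) + d_dx(sigma, x, t) ** 2 / (2.0 * M)


def continuity_residual(x, t):
    """d(rho)/dt + d/dx(rho * v) with v = (1/m) d(sigma)/dx; should be close to zero."""
    def flux(y, s):
        return rho(y, s) * d_dx(sigma, y, s) / M
    return d_dt(rho, x, t) + d_dx(flux, x, t)


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.7, 2.0):
        print(x, hamilton_jacobi_residual(x, 1.0), continuity_residual(x, 1.0))
```

Both residuals vanish only up to finite-difference error. The example also shows the freedom noted above: any normalized profile $f$ would satisfy the continuity equation for this $\sigma$, so the classical equations alone do not single out a density.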

III. CONNECTION TO BROWNIAN MOTION 

Although $\psi$ is a solution of the diffusion equation, it will not correspond in general to
the case of Brownian motion. Here, $\psi = \sqrt{\rho}\, e^{+\sigma/D}$ is proportional to the square root of
a probability distribution, while in Brownian motion the probability distribution $\rho$ is the
function that satisfies the diffusion equation. The $\psi$ that we have derived here is essentially
the "wave function" of Euclidean quantum mechanics ||. 

The case of Brownian motion corresponds to a particular solution of eqs. (6) and (11),
one for which

In this case, the velocity field then takes the form

Eq. (16) is known as the osmotic equation, and $v_{bm}$ is the osmotic velocity. If we substitute
$\sigma_{bm}$ into eqs. (13) and (14), the "wave functions" become

which solve eqs. (2) and (3) provided the probability density $\rho$ is a solution of the diffusion
equation. One can also check that eqs. (6) and (11) both reduce to

when eq. (15) holds.
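
The way a velocity field proportional to $\partial \ln\rho/\partial x$ turns the continuity equation into a diffusion-type equation can be checked numerically. The sketch below is a hedged illustration rather than a reproduction of the equations of this section: it assumes $v = s\,(D/2m)\,\partial \ln\rho/\partial x$ with the sign $s = \pm 1$ left as a parameter, and uses a Gaussian heat kernel (run forward in time for $s = -1$, time reversed for $s = +1$) as the test density. The constants `D`, `M`, `T_END` and all function names are hypothetical.

```python
import math

D = 1.0                 # illustrative value of the constant D
M = 2.0                 # illustrative particle mass
KAPPA = D / (2.0 * M)   # diffusion constant D/2m
T_END = 3.0             # reference time for the time-reversed density


def heat_kernel(x, tau):
    """Gaussian solution of d(rho)/d(tau) = KAPPA * d^2(rho)/dx^2."""
    return math.exp(-x * x / (4.0 * KAPPA * tau)) / math.sqrt(4.0 * math.pi * KAPPA * tau)


def rho(x, t, sign):
    """Forward kernel for sign = -1, time-reversed kernel for sign = +1."""
    tau = t if sign < 0 else T_END - t
    return heat_kernel(x, tau)


def velocity(x, t, sign, h=1e-4):
    """Assumed velocity field v = sign * KAPPA * d(ln rho)/dx."""
    dlog = (math.log(rho(x + h, t, sign)) - math.log(rho(x - h, t, sign))) / (2.0 * h)
    return sign * KAPPA * dlog


def continuity_residual(x, t, sign, h=1e-4):
    """d(rho)/dt + d(rho v)/dx, evaluated by central differences."""
    rho_t = (rho(x, t + h, sign) - rho(x, t - h, sign)) / (2.0 * h)
    def flux(y):
        return rho(y, t, sign) * velocity(y, t, sign)
    return rho_t + (flux(x + h) - flux(x - h)) / (2.0 * h)


def diffusion_residual(x, t, sign, h=1e-3):
    """d(rho)/dt + sign * KAPPA * d^2(rho)/dx^2, evaluated by central differences."""
    rho_t = (rho(x, t + h, sign) - rho(x, t - h, sign)) / (2.0 * h)
    rho_xx = (rho(x + h, t, sign) - 2.0 * rho(x, t, sign) + rho(x - h, t, sign)) / (h * h)
    return rho_t + sign * KAPPA * rho_xx


if __name__ == "__main__":
    for sign in (-1, +1):
        print(sign, continuity_residual(0.3, 1.0, sign), diffusion_residual(0.3, 1.0, sign))
```

With this choice of velocity the flux is $\rho v = s\,(D/2m)\,\partial\rho/\partial x$, so the continuity equation becomes $\partial_t\rho + s\,(D/2m)\,\partial_x^2\rho = 0$: the forward diffusion equation for $s = -1$ and its time-reversed counterpart for $s = +1$.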



IV. DISCUSSION 

It has been shown that the diffusion equation and its adjoint (time reversed) equation can 
be derived using an information-theoretic approach that is based on the principle of minimum 
Fisher information. In the information-theoretic approach followed here, the emphasis is on 
using the principle of minimum Fisher information to complement a physical picture derived 
from a particular hydrodynamical model of the system. Variation of the Lagrangian (9)
can be interpreted as the minimization of the Fisher information subject to the constraint
that the probability density satisfy the continuity equation (6), which arises naturally in the
hydrodynamical model. An alternative approach to the diffusion equation that also uses 
minimum Fisher information can be found in Ref. @]. This derivation, however, differs 
from the present one in two crucial respects; in particular, the equation is not derived from 
a Lagrangian, and the derivation does not make reference to the hydrodynamical model. 

The approach followed here provides a physically well-motivated derivation of the diffusion
equation which distinguishes between physical and information-theoretical assumptions.
A similar approach leads to the Schrodinger and Pauli |J equations. 

V. APPENDIX A 

Let $\mu$ be a measure defined on $R^n$, let $P(y^i)$ be a probability density with respect to
$\mu$, which is a function of $n$ continuous parameters $y^i$, and let $P(y^i + \Delta y^i)$ be the density
that results from a small change in the $y^i$. Expand $P(y^i + \Delta y^i)$ in a Taylor series, and
calculate the cross-entropy up to the first non-vanishing term,

The terms in square brackets are the elements of the Fisher information matrix (while this
is not the most general definition of the Fisher information matrix, it is one that applies to 
the present case 0. For the general case, see Ref. [[7]). 
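
The relation between the cross-entropy of two nearby densities and the Fisher information can be illustrated numerically. The sketch below assumes a one-parameter example that is not taken from the text: a Gaussian density whose mean plays the role of the parameter $y$, with the cross-entropy taken as $\int P(y+\Delta y)\,\ln[P(y+\Delta y)/P(y)]\,dx$. For small $\Delta y$ its leading term should be $\tfrac{1}{2} F\,\Delta y^2$, where $F = \int (1/P)(\partial P/\partial y)^2\,dx$. All function names and numerical settings are hypothetical.

```python
import math


def gaussian(x, mean, s):
    """Normal density with the given mean and standard deviation s."""
    return math.exp(-0.5 * ((x - mean) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))


def integrate(f, a, b, n=20000):
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def cross_entropy(mean, delta, s):
    """Integral of P(y + delta) * ln(P(y + delta) / P(y)) over x."""
    def integrand(x):
        shifted = gaussian(x, mean + delta, s)
        return shifted * math.log(shifted / gaussian(x, mean, s))
    return integrate(integrand, mean - 10.0 * s, mean + 10.0 * s)


def fisher_information(mean, s, h=1e-5):
    """Integral of (dP/dy)^2 / P over x, with dP/dy by central difference in the mean."""
    def integrand(x):
        dp = (gaussian(x, mean + h, s) - gaussian(x, mean - h, s)) / (2.0 * h)
        return dp * dp / gaussian(x, mean, s)
    return integrate(integrand, mean - 10.0 * s, mean + 10.0 * s)


if __name__ == "__main__":
    s, delta = 1.5, 0.01
    f_info = fisher_information(0.0, s)
    print("Fisher information:", f_info, "expected 1/s^2 =", 1.0 / s ** 2)
    print("cross-entropy:", cross_entropy(0.0, delta, s))
    print("0.5 * F * delta^2:", 0.5 * f_info * delta ** 2)
```

For this Gaussian family the quadratic term is the whole answer, so the two printed quantities agree up to integration error; for other families the agreement holds only to leading order in $\Delta y$.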

If $P$ is defined over an $n$-dimensional manifold $M$ with (positive) metric $g^{ik}$, there is a
natural definition of the amount of information $I$ associated with $P$, which is obtained by
contracting the metric $g^{ik}$ with the elements of the Fisher information matrix,

In the case where $M$ is the $n+1$ dimensional extended configuration space $QT$ (with
coordinates $\{t, x^1, \ldots, x^n\}$) of a non-relativistic particle of mass $m$, the natural metric is the
one used to define the kinematical line element in configuration space, which is of the form
$g^{ik} = \mathrm{diag}(0, 1/m, \ldots, 1/m)$ Hj. Note that with this metric, it is straightforward to generalize
the results of the paper to the case of diffusion in many space dimensions. In particular, we
replace the velocity field in eq. (4) by the expression $v^i = g^{ik}\,\partial\sigma/\partial x^k$, and the Lagrangian
in eq. (9) by
In the case of one time and one space dimension, eq. (21) reduces to eq. (5).

To express $I$ in units of energy, we need to introduce a conversion factor with units of
action squared and multiply eq. (5) by this factor. In the case of the diffusion process, we
can set the conversion factor proportional to $D^2$, although it is also possible to introduce a
universal constant of action, such as $\hbar$, and set the conversion factor proportional to $\hbar^2$.

VI. APPENDIX B 

We want to examine the extremum obtained from the fixed end-point variation of the
Lagrangian $L_D$. In particular, we wish to show the following: given $\rho$ and $\sigma$ that satisfy eqs.
(6) and (11), a small variation of the probability density $\rho(x,t) \to \rho(x,t)' = \rho(x,t) + \epsilon\,\delta\rho(x,t)$
for fixed $\sigma$ will lead to an increase in $L_D$, as well as an increase in the Fisher information $I$.
We assume fixed end-point variations ($\delta\rho = 0$ at the boundaries), and variations $\epsilon\,\delta\rho$ that
are well defined in the sense that $\rho'$ will have the usual properties required of a probability
distribution (such as $\rho' > 0$ and normalization).

Let $\rho \to \rho' = \rho + \epsilon\,\delta\rho$. Since $\rho$ and $\sigma$ are solutions of the variational problem, the terms
linear in $\epsilon$ vanish. If we keep terms up to order $\epsilon^2$, we find that

Using the relation

we can write $\Delta L_D$ as

which shows that $\Delta L_D > 0$ for small variations, and therefore the extremum of $L_D$ is
a minimum. Furthermore, since $\Delta L_D \sim D^2$, it is the Fisher information term $I$ in the
Lagrangian $L_D$ that increases, and the extremum is also a minimum of the Fisher information.
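
The positivity of the second-order change can be illustrated on a single time slice. The sketch below is a numerical illustration under assumptions not taken from the text: it uses the spatial functional $F[\rho] = \int (\partial_x\rho)^2/\rho\,dx$, a unit Gaussian $\rho$, and a perturbation $\delta\rho = x\,e^{-x^2}$ that integrates to zero and decays at the boundaries. Expanding $F[\rho + \epsilon\,\delta\rho]$ to order $\epsilon^2$ gives the quadratic term $\epsilon^2 \int (1/\rho)\,(\partial_x\delta\rho - \delta\rho\,\partial_x\rho/\rho)^2\,dx$, which is manifestly non-negative; the code compares this with a symmetric second difference. All names and numerical settings are hypothetical.

```python
import math


def rho(x):
    """Unit Gaussian density (illustrative choice)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def rho_x(x):
    """Analytic derivative of rho."""
    return -x * rho(x)


def drho(x):
    """Perturbation delta-rho: odd, so it integrates to zero."""
    return x * math.exp(-x * x)


def drho_x(x):
    """Analytic derivative of the perturbation."""
    return (1.0 - 2.0 * x * x) * math.exp(-x * x)


def integrate(f, a=-8.0, b=8.0, n=16000):
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def fisher(eps):
    """F[rho + eps * drho] = integral of (rho' + eps drho')^2 / (rho + eps drho)."""
    return integrate(lambda x: (rho_x(x) + eps * drho_x(x)) ** 2 / (rho(x) + eps * drho(x)))


def second_order_term():
    """Integral of (1/rho) * (drho' - rho' * drho / rho)^2, the coefficient of eps^2."""
    return integrate(lambda x: (drho_x(x) - rho_x(x) * drho(x) / rho(x)) ** 2 / rho(x))


if __name__ == "__main__":
    eps = 1e-2
    second_difference = fisher(eps) + fisher(-eps) - 2.0 * fisher(0.0)
    print("second difference:", second_difference)
    print("2 * eps^2 * Q:    ", 2.0 * eps ** 2 * second_order_term())
```

The symmetric second difference removes any term linear in $\epsilon$, so the comparison isolates the quadratic term regardless of whether the chosen $\rho$ is an extremum.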